Similar resources
Point-Based Policy Iteration
We describe a point-based policy iteration (PBPI) algorithm for infinite-horizon POMDPs. PBPI replaces the exact policy improvement step of Hansen’s policy iteration with point-based value iteration (PBVI). Despite being an approximate algorithm, PBPI is monotonic: At each iteration before convergence, PBPI produces a policy for which the values increase for at least one of a finite set of init...
Sigma point policy iteration
In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear value function approximation. These algorithms rely on policy-dependent expectations of the transition and reward functions, which require all experience to be remembered and iterated over for each new policy evaluated....
Fixed Point Iteration Method
We discuss the problem of finding approximate solutions of the equation f(x) = 0 (1). In some cases it is possible to find the exact roots of equation (1), for example when f(x) is a quadratic or cubic polynomial; otherwise, in general, one is interested in finding approximate solutions using numerical methods. Here, we will discuss a method called the fixed point iteration method and a part...
Solving time-fractional chemical engineering equations by modified variational iteration method as fixed point iteration method
The variational iteration method (VIM) was extended to find approximate solutions of fractional chemical engineering equations. The Lagrange multipliers of the VIM were not identified explicitly. In this paper we improve the VIM by using the concept of the fixed point iteration method. This method was then implemented for solving a system of time-fractional chemical engineering equations. The ob...
Parallel Iteration of the Extended Backward Differentiation Formulas
The extended backward differentiation formulas (EBDFs) and their modified form (MEBDF) were proposed by Cash in the 1980s for solving initial-value problems (IVPs) for stiff systems of ordinary differential equations (ODEs). In a recent performance evaluation of various IVP solvers, including a variable-step-variable-order implementation of the MEBDF method by Cash, it turned out that the MEBDF...
Journal
Journal title: Acta Mathematica
Year: 1981
ISSN: 0001-5962
DOI: 10.1007/bf02392866